
CiteSeerX

Oxford University Research Archive

Principles of Experimental Evaluation

Author: Manolescu Ioana
Publication venue: HAL CCSD
Publication date: 09/06/2014

Experimental validation has become essential for gaining acceptance of the conclusions of a scientific study, particularly in the field of data management. At the same time, carrying out a complete and credible experimental study requires mastering a set of techniques and good practices for organizing the study, analyzing the results so as to interpret them correctly, and making the experiments easy to reproduce, so that others can analyze and/or repeat them. The tutorial focuses on two aspects of experimentation: planning experiments, and good practices for making experiments reproducible. An extended version was presented at the ICDE 2008 and EDBT 2009 conferences.

Teaching an RDBMS about ontological constraints

Author: Bursztyn Damian
Goasdoué François
Manolescu Ioana
Publication venue: HAL CCSD
Publication date: 05/09/2016

In the presence of an ontology, query answers must reflect not only data explicitly present in the database, but also implicit data, which holds due to the ontology, even though it is not present in the database. A large and useful set of ontology languages enjoys FOL reducibility of query answering: answering a query can be reduced to evaluating a certain first-order logic (FOL) formula (obtained from the query and ontology) against only the explicit facts. We present a novel query optimization framework for ontology-based data access settings enjoying FOL reducibility. Our framework is based on searching within a set of alternative equivalent FOL queries, i.e., FOL reformulations, one with minimal evaluation cost when evaluated through a relational database system. We apply this framework to the DL-LiteR Description Logic underpinning the W3C's OWL2 QL ontology language, and demonstrate through experiments its performance benefits when two leading SQL systems, one open-source and one commercial, are used for evaluating the FOL query reformulations.

HAL-Polytechnique

Query-Oriented Summarization of RDF Graphs

Author: Goasdoué François
Manolescu Ioana
Čebirić Šejla
Publication venue: HAL CCSD
Publication date: 13/04/2015

The Resource Description Framework (RDF) is the W3C's graph data model for Semantic Web applications. We study the problem of RDF graph summarization: given an input RDF graph G, find an RDF graph G' which summarizes G as accurately as possible, while being possibly orders of magnitude smaller than the original graph. Our approach is query-oriented, i.e., querying a summary of a graph should reflect whether the query has some answers against this graph. The summaries are intended to help with query formulation and optimization. We introduce two summaries: a baseline which is compact and simple and satisfies certain accuracy and representativeness properties, but may oversimplify the RDF graph, and a refined one which trades some of these properties for more accuracy in representing the structure.

HAL-Polytechnique

Algebraic incremental maintenance of XML views

Author: Bonifati Angela
Goodfellow Martin
Manolescu Ioana
Sileo Domenica
Publication venue: Association for Computing Machinery (ACM)
Publication date: 22/04/2013

Materialized views can bring important performance benefits when querying XML documents. In the presence of XML document changes, materialized views need to be updated to faithfully reflect the changed document. In this work, we present an algebraic approach for propagating source updates to XML materialized views expressed in a powerful XML tree pattern formalism. Our approach differs from the state of the art in the area in two important ways. First, it relies on set-oriented, algebraic operations, to be contrasted with node-based previous approaches. Second, it exploits state-of-the-art features of XML stores and XML query evaluation engines, notably XML structural identifiers and associated structural join algorithms. We present algorithms for determining how updates should be propagated to views, and highlight the benefits of our approach over existing algorithms through a series of experiments.

University of Strathclyde Institutional Repository

HAL - Lille 3

arXiv.org e-Print Archive

Hal-Diderot

XML content warehousing: Improving sociological studies of mailing lists and web data

Author: Colazzo Dario
Dudouet François-Xavier
Manolescu Ioana
Nguyen Benjamin
Senellart Pierre
Vion Antoine
Publication date: 01/01/2011

In this paper, we present guidelines for an XML-based approach to the sociological study of Web data, such as the analysis of mailing lists or databases available online. An XML warehouse is a flexible solution for storing and processing this kind of data. We propose an implemented solution and show possible applications through our case study of the profiles of experts involved in W3C standard-setting activity. We illustrate the sociological use of semi-structured databases by presenting our XML Schema for mailing-list warehousing. An XML Schema allows data sources to be added or cross-referenced without modifying existing data sets, while accommodating structural evolution. We also show that the existence of hidden data implies increased complexity for traditional SQL users. XML content warehousing allows both exhaustive warehousing and recursive queries over content, with far less dependence on the initial storage. Finally, we present the possibility of exporting the data stored in the warehouse to commonly used advanced software devoted to sociological analysis.

Base de publications de l'université Paris-Dauphine

HAL AMU

arXiv.org e-Print Archive

The WebStand Project

Author: Colazzo Dario
Dudouet François-Xavier
Manolescu Ioana
Nguyen Benjamin
Senellart Pierre
Vion Antoine
Publication date: 01/01/2009

In this paper, we present the state of advancement of the French ANR WebStand project. The objective of this project is to construct a customizable XML-based warehouse platform to acquire, transform, analyze, store, query and export data from the web, in particular mailing lists, with the final intention of using this data to perform sociological studies focused on social groups of the World Wide Web, with a specific emphasis on the temporal aspects of this data. We are currently using this system to analyze the standardization process of the W3C, through its social network of standard setters.

Base de publications de l'université Paris-Dauphine

CiteSeerX

HAL AMU

Social, Structured and Semantic Search

Author: Bonaque Raphaël
Cautis Bogdan
Goasdoué François
Manolescu Ioana
Publication venue: HAL CCSD
Publication date: 15/03/2016

Social content such as blogs, tweets, news, etc. is a rich source of interconnected information. We identify a set of requirements for the meaningful exploitation of such rich content, and present a new data model, called S3, which is the first to satisfy them. S3 captures social relationships between users, and between users and content, but also the structure present in rich social content, as well as its semantics. We provide the first top-k keyword search algorithm taking into account the social, structured, and semantic dimensions and formally establish its termination and correctness. Experiments on real social networks demonstrate the efficiency and qualitative advantage of our algorithm through the joint exploitation of the social, structured, and semantic dimensions of S3.

HAL-Polytechnique

Dagstuhl Research Online Publication Server

The XQueC Project: Compressing and Querying XML

Author: Arion Andrei
Bonifati Angela
Manolescu Ioana
Pugliese Andrea
Publication venue: Dagstuhl Seminar Proceedings. 08261 - Structure-Based Compression of Complex Massive Data
Publication date: 01/01/2008

RDF Analytics: Lenses over Semantic Graphs

Author: Colazzo Dario
Goasdoué François
Manolescu Ioana
Roatis Alexandra
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2014

The development of the Semantic Web (RDF) brings new requirements for data analytics tools and methods, going beyond querying to semantics-rich analytics through warehouse-style tools. In this work, we fully redesign, from the bottom up, core data analytics concepts and tools in the context of RDF data, leading to the first complete formal framework for warehouse-style RDF analytics. Notably, we define i) analytical schemas tailored to heterogeneous, semantics-rich RDF graphs, ii) analytical queries which (beyond relational cubes) allow flexible querying of the data and the schema as well as powerful aggregation, and iii) OLAP-style operations. Experiments on a fully implemented platform demonstrate the practical interest of our approach.

Base de publications de l'université Paris-Dauphine

Crossref